Count-Based Exploration with Neural Density Models
Authors
Abstract
Bellemare et al. (2016) introduced the notion of a pseudo-count, derived from a density model, to generalize count-based exploration to non-tabular reinforcement learning. This pseudo-count was used to generate an exploration bonus for a DQN agent and, combined with a mixed Monte Carlo update, was sufficient to achieve state-of-the-art performance on the Atari 2600 game Montezuma’s Revenge. We consider two questions left open by their work. First, how important is the quality of the density model for exploration? Second, what role does the Monte Carlo update play in exploration? We answer the first question by demonstrating the use of PixelCNN, an advanced neural density model for images, to supply a pseudo-count. In particular, we examine the intrinsic difficulties in adapting Bellemare et al.’s approach when assumptions about the model are violated. The result is a more practical and general algorithm requiring no special apparatus. We combine PixelCNN pseudo-counts with different agent architectures to dramatically improve the state of the art on several hard Atari games. One surprising finding is that the mixed Monte Carlo update is a powerful facilitator of exploration in the sparsest of settings, including Montezuma’s Revenge.
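The abstract does not spell out the quantities involved, so the following is only a minimal sketch of the three ingredients it names, written from the general form of the pseudo-count in Bellemare et al. (2016), not from this paper's code. The function names, the bonus constants `scale` and `eps`, and the handling of a non-increasing density are illustrative assumptions. Here `rho` is the density the model assigns to a state before it is trained on that state, and `rho_prime` is the density it assigns immediately after.

```python
import math


def pseudo_count(rho, rho_prime):
    """Pseudo-count N = rho * (1 - rho_prime) / (rho_prime - rho).

    rho:       model density of the state before one update on it
    rho_prime: model density of the same state after that update
    """
    if rho_prime <= rho:
        # Assumption: if the update did not raise the density, treat the
        # state as fully familiar so that it earns no bonus.
        return math.inf
    return rho * (1.0 - rho_prime) / (rho_prime - rho)


def exploration_bonus(count, scale=0.05, eps=0.01):
    """Bonus that shrinks as the pseudo-count grows (constants are illustrative)."""
    return scale / math.sqrt(count + eps)


def mixed_mc_target(one_step_target, mc_return, beta):
    """Blend a one-step bootstrapped target with a Monte Carlo return."""
    return (1.0 - beta) * one_step_target + beta * mc_return


# Sanity check with an empirical (tabular) density model: a state seen 3 times
# in 10 observations has rho = 3/10 and, after one more visit, rho' = 4/11.
n_hat = pseudo_count(3 / 10, 4 / 11)
bonus = exploration_bonus(n_hat)
```

For an empirical count-based model the pseudo-count recovers the true visit count, which is why the check above uses 3/10 and 4/11. With a neural density model such as PixelCNN the two densities would come from evaluating the network before and after a training step on the frame.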
Similar resources
Unifying Count-Based Exploration and Intrinsic Motivation
We consider an agent’s uncertainty about its environment and the problem of generalizing this uncertainty across states. Specifically, we focus on the problem of exploration in non-tabular reinforcement learning. Drawing inspiration from the intrinsic motivation literature, we use density models to measure uncertainty, and propose a novel algorithm for deriving a pseudo-count from an arbitrary ...
Modeling Flexural Strength of EPS Lightweight Concrete Using Regression, Neural Network and ANFIS
Lightweight concrete (LWC) is a kind of concrete that is made with lightweight aggregates or gas bubbles. These aggregates can be natural or artificial, and expanded polystyrene (EPS) lightweight concrete is the most interesting lightweight concrete and has good mechanical properties. The bulk density of this kind of concrete is between 300 and 2000 kg/m³. In this paper flexural strength of EPS is modeled...
Rock Brittleness Prediction Using Geomechanical Properties of Hamekasi Limestone: Regression and Artificial Neural Networks Analysis
A cold climate favors the development of tension cracks and a decrease in rock brittleness. Therefore, this paper investigates the Hamekasi porous limestone in order to predict the brittleness indices during freeze-thaw cycles. The freeze-thaw test was executed for one cycle including 16 h of freezing and 8 h of thawing. The geomechanical properties and brittl...
Fitting of Count Time Series Models on the Number of Patients Referred to Addiction Treatment Centers in Semnan County
Abstract. Count data over time are observed in many application areas. Many researchers use time series patterns to analyze such data. In this paper, Poisson and negative binomial linear count time series models with explanatory variables are studied for this type of data. The likelihood analysis and the evaluation of count time series models based on generalized linear models are pres...
Using Artificial Neural Networks to Estimate Hydrocarbon Volume in Place
Accurate estimation of the hydrocarbon volume in a reservoir is important for future development of and investment in that reservoir. Estimation of oil and gas reserves continues from exploration to the end of the reservoir's life and is a routine task for upstream engineers. In this study we tried to make reservoir properties models (porosity and water saturation) and estimate reservoir volume hydro...
Journal title:
Volume, issue:
Pages: -
Publication date: 2017